Towards Real Time Discovery from Distributed Information Sources

Authors

  • Vincent Cho
  • Beat Wüthrich
Abstract

Many successful knowledge discovery and data mining techniques and systems have been developed. These techniques usually apply to centralized databases and place few restrictions on learning and response time. Comparatively little effort has gone into mining distributed databases or into real-time issues. In this paper, we investigate issues of fast distributed data mining. We assume either that merging the distributed databases into a single one would be too costly (distributed case), or that the individual fragments are non-uniform, so that mining only one fragment would bias the result (fragmented case). The goal is to classify each object O of the database into one of several mutually exclusive classes Ci. Our approach to making mining fast and feasible is as follows. From each data site or fragment dbk, only a single rule rik is generated for each category Ci. A small subset {ri1, ..., rih} of these individual rules is selected to form a rule set Ri for each category Ci. These rule sets adequately represent the hidden knowledge of the entire database. Various selection criteria for forming Ri are discussed, both theoretically and experimentally. Each rule set Ri represents one category Ci and states whether an object O belongs to Ci or not. However, O may still be classified by Ri and Rj as belonging to both Ci and Cj, which is not possible since the classes are mutually exclusive. Various novel methods for making the final classifications mutually exclusive are investigated, and one superior method is identified.
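The pipeline outlined in the abstract (one rule per site and category, selection of a small rule set Ri, then conflict resolution) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the `Rule` structure, the `confidence` score, the top-h selection criterion, and the highest-score tie-break are hypothetical stand-ins for the selection criteria and the conflict-resolution methods the authors actually investigate.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical object representation: feature name -> numeric value.
Obj = Dict[str, float]


@dataclass
class Rule:
    category: str                     # class Ci the rule votes for
    site: str                         # fragment dbk the rule was learned on
    predicate: Callable[[Obj], bool]  # True when the rule fires on an object
    confidence: float                 # assumed quality score in [0, 1]


def select_rules(rules: List[Rule], h: int) -> Dict[str, List[Rule]]:
    """Form each rule set Ri by keeping the h highest-confidence rules
    for category Ci (an assumed selection criterion)."""
    by_category: Dict[str, List[Rule]] = {}
    for rule in rules:
        by_category.setdefault(rule.category, []).append(rule)
    return {
        category: sorted(group, key=lambda r: r.confidence, reverse=True)[:h]
        for category, group in by_category.items()
    }


def score(rule_set: List[Rule], obj: Obj) -> float:
    """Sum the confidences of the rules in Ri that fire on obj."""
    return sum(r.confidence for r in rule_set if r.predicate(obj))


def classify(rule_sets: Dict[str, List[Rule]], obj: Obj) -> Optional[str]:
    """Make the classification mutually exclusive by returning only the
    best-scoring category (an assumed tie-break, not the paper's method).
    Returns None when no rule fires."""
    scores = {category: score(rs, obj) for category, rs in rule_sets.items()}
    if not scores:
        return None
    best = max(scores, key=lambda category: scores[category])
    return best if scores[best] > 0 else None


# Toy rules, as if learned independently on three fragments.
rules = [
    Rule("high", "db1", lambda o: o["x"] > 5, 0.9),
    Rule("high", "db2", lambda o: o["x"] > 4, 0.6),
    Rule("high", "db3", lambda o: o["x"] > 8, 0.7),
    Rule("low", "db1", lambda o: o["x"] < 6, 0.8),
    Rule("low", "db2", lambda o: o["x"] < 3, 0.5),
]
rule_sets = select_rules(rules, h=2)
label = classify(rule_sets, {"x": 5.5})  # both rule sets fire on this object
```

In this toy setup only the rules travel between sites, never the data, which is the point of the distributed case: each fragment contributes one small rule per category instead of its rows.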




Publication date: 1998